Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation
نویسندگان
چکیده
CNN architectures have terrific recognition performance but rely on spatial pooling which makes it difficult to adapt them to tasks that require dense, pixel-accurate labeling. This paper makes two contributions: (1) We demonstrate that while the apparent spatial resolution of convolutional feature maps is low, the high-dimensional feature representation contains significant sub-pixel localization information. (2) We describe a multi-resolution reconstruction architecture based on a Laplacian pyramid that uses skip connections from higher resolution feature maps and multiplicative gating to successively refine segment boundaries reconstructed from lower-resolution maps. This approach yields state-of-the-art semantic segmentation results on the PASCAL VOC and Cityscapes segmentation benchmarks without resorting to more complex random-field inference or instance detection driven architectures.
منابع مشابه
Laplacian Reconstruction and Refinement for Semantic Segmentation
CNN architectures have terrific recognition performance but rely on spatial pooling which makes it difficult to adapt them to tasks that require dense, pixel-accurate labeling. This paper makes two contributions: (1) We demonstrate that while the apparent spatial resolution of convolutional feature maps is low, the high-dimensional feature representation contains significant sub-pixel localizat...
متن کاملMultiple Image Fusion Using Laplacian Pyramid
Image fusion is an important visualization technique of integrating coherent spatial and temporal information into a compact form. Laplacian fusion is a process that combines regions of images from different sources into a single fused image based on a salience selection rule for each region. In this paper, we proposed an algorithmic approach using a Laplacian and Gaussian pyramid to better loc...
متن کاملFrame reconstruction of the Laplacian pyramid
We study the Laplacian pyramid (LP) as a frame operator, and this reveals that the usual reconstruction is suboptimal. With orthogonal filters, the LP is shown to be a tight frame, thus the optimal linear reconstruction using the dual frame operator has a simple structure as symmetrical with the forward transform. For more general cases, we propose an efficient filter bank for reconstruction in...
متن کاملNoise Processing for Simple Laplacian Pyramid Synthesis Based on Dual Frame Reconstruction
The Laplacian pyramid (LP) provides a frame expansion. Thus, there exist infinitely many synthesis operators which achieve perfect reconstruction in the absence of quantization. However, if the subbands are quantized in the open-loop mode then the dual frame synthesis operator, which is the pseudo-inverse of the analysis operator, minimizes the mean squared error (MSE) in the reconstruction. No...
متن کاملUnsupervised Representation Learning with Laplacian Pyramid Auto-encoders
Scale-space representation has been popular in computer vision community due to its theoretical foundation. The motivation for generating a scale-space representation of a given data set originates from the basic observation that real-world objects are composed of different structures at different scales. Hence, it’s reasonable to consider learning features with image pyramids generated by smoo...
متن کامل